Speed up Ape-X implementation on a single machine

Although neural networks are optimized for GPUs, the environments used in reinforcement learning (e.g. simulators) are not always GPU friendly.

One way to speed up reinforcement learning is to run multiple explorers in parallel, as Ape-X does.

After a long struggle, cpprb v9.4.2 finally supports Ape-X style learning with multiple explorers on a single machine. By using the new classes MPReplayBuffer or MPPrioritizedReplayBuffer, users can speed up their Ape-X implementations without worrying about troublesome multiprocessing details.
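A minimal sketch of the API (the env_dict layout and the shapes below are illustrative, not tied to any particular environment):

```python
import numpy as np
from cpprb import MPPrioritizedReplayBuffer

# Multi-process-safe prioritized buffer; env_dict describes the stored fields.
# A 4-dimensional observation and scalar action/reward/done are assumed here.
rb = MPPrioritizedReplayBuffer(1_000_000,
                               env_dict={"obs": {"shape": 4},
                                         "act": {},
                                         "rew": {},
                                         "next_obs": {"shape": 4},
                                         "done": {}})

# Explorers add transitions ...
for _ in range(100):
    rb.add(obs=np.random.rand(4), act=0, rew=1.0,
           next_obs=np.random.rand(4), done=0)

# ... while the learner samples (with importance weights and indexes)
# and feeds updated priorities back after computing TD errors.
sample = rb.sample(32)
rb.update_priorities(sample["indexes"], np.full(32, 0.5))
```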

These buffer classes map their internal data onto shared memory, so no proxy (e.g. multiprocessing.managers.SyncManager) or queue (e.g. multiprocessing.Queue) is necessary for interprocess data sharing.
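Because the storage itself lives in shared memory, the buffer object can simply be handed to multiprocessing.Process workers. A hedged sketch, with a toy explorer that just writes random transitions:

```python
from multiprocessing import Process

import numpy as np
from cpprb import MPPrioritizedReplayBuffer


def explorer(rb):
    """Toy explorer: writes random transitions straight into the shared buffer."""
    for _ in range(1_000):
        rb.add(obs=np.random.rand(4), act=0, rew=0.0,
               next_obs=np.random.rand(4), done=0)


if __name__ == "__main__":
    rb = MPPrioritizedReplayBuffer(10_000,
                                   env_dict={"obs": {"shape": 4},
                                             "act": {},
                                             "rew": {},
                                             "next_obs": {"shape": 4},
                                             "done": {}})

    # No SyncManager proxy or Queue: the buffer is passed to workers as-is.
    workers = [Process(target=explorer, args=(rb,)) for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    print(rb.get_stored_size())  # transitions written by all explorers
```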

When using shared memory, locking is important to avoid data races. These classes automatically lock only the critical sections, which is much more efficient than simply locking the entire buffer.

The details and an example Ape-X implementation are described in the documentation.
